scAWMV: an adaptively weighted multi-view learning framework for the integrative analysis of parallel scRNA-seq and scATAC-seq data

Abstract Motivation Technological advances have enabled us to profile single-cell multi-omics data from the same cells, providing us with an unprecedented opportunity to understand the cellular phenotype and links to its genotype. The available protocols and multi-omics datasets [including parallel single-cell RNA sequencing (scRNA-seq) and single-cell ATAC sequencing (scATAC-seq) data profiled from the same cell] are growing increasingly. However, such data are highly sparse and tend to have high level of noise, making data analysis challenging. The methods that integrate the multi-omics data can potentially improve the capacity of revealing the cellular heterogeneity. Results We propose an adaptively weighted multi-view learning (scAWMV) method for the integrative analysis of parallel scRNA-seq and scATAC-seq data profiled from the same cell. scAWMV considers both the difference in importance across different modalities in multi-omics data and the biological connection of the features in the scRNA-seq and scATAC-seq data. It generates biologically meaningful low-dimensional representations for the transcriptomic and epigenomic profiles via unsupervised learning. Application to four real datasets demonstrates that our framework scAWMV is an efficient method to dissect cellular heterogeneity for single-cell multi-omics data. Availability and implementation The software and datasets are available at https://github.com/pengchengzeng/scAWMV. Supplementary information Supplementary data are available at Bioinformatics online.


Introduction
The recent advances in single-cell technologies have enabled us to probe multiple biological layers. These technologies include single-cell RNA sequencing (scRNA-seq) that profiles transcription, single-cell ATAC sequencing (scATAC-seq) that profiles accessible chromatin regions (Macaulay et al., 2017;Mezger et al., 2018) and other methods. There are increasing demands for computationally efficient methods for processing and analyzing the datasets (Rotem et al., 2015;Rozenblatt-Rosen et al., 2017) brought by these technologies. For example, clustering methods that group similar cells into sub-populations are often used as the first step in the analysis of single-cell genomic data. The clustering methods for scRNA-seq data include SIMLR (Wang et al., 2017), SC3 (Kiselev et al., 2017), SAFE-clustering (Yang et al., 2018), SOUP (Zhu et al., 2019), SHARP (Wan et al., 2020) and other methods. The clustering methods for scATAC-seq data include Cusanovich 2018(Cusanovich et al., 2018, scABC (Zamanighomi et al., 2018), SCALE (Xiong et al., 2019), cisTopic (Gonzalez-Blas et al., 2019) and other methods. However, the aforementioned methods are all designed for analyzing one type of genomic feature.
Recently, the technological advances have enabled us to profile high-dimensional multi-omics data at an unprecedented resolution, and the available protocols and multi-omics datasets are growing at an increasing pace. For example, scM&T-seq (Angermueller et al., 2016) links transcriptional and epigenetic heterogeneity, scNMTseq (Clark et al., 2018) enables joint profiling of chromatin accessibility, DNA methylation and transcription in single cells, and more protocols have been developed for joint profiling of chromatin and transcriptome, including sci-CAR-seq (Cao et al., 2018), scCAT-seq (Liu et al., 2019), Paired-seq (Zhu et al., 2019), SNARE-seq (Chen et al., 2019), SHARE-seq (Ma et al., 2020) and Paired-Tag (Zhu et al., 2021). The resulting single-cell multi-omics datasets can provide insights into the cell's phenotype and links to its genotype (Macaulay et al., 2017). However, such data tend to have high level of noise and are highly sparse (Colomé-Tatché and Theis, 2018). These characteristics bring challenges for analyzing the multi-omics single-cell data. In order to analyze the complex biological process varying across cells, we need to integrate different types of genomic features via flexible but rigorous computational methods. The methods of data integration are growing recently, and they can be divided into three categories: (i) Non-negative matrix factorization (NMF)based methods. coupleNMF (Duren et al., 2018) is based on extensions of NMF, and the connection between chromatin accessibility and gene expression builds upon prediction models trained from bulk data with diverse cell types. LIGER (Welch et al., 2019) implements integrative NMF to infer a shared low-dimensional space in multiple single-cell datasets. scAI (Jin et al., 2020) aggregates epigenomic data in cell subpopulations that exhibit similar gene expression and epigenomic profiles through iterative learning in an unsupervised manner. JSNMF (Ma et al., 2022) is based on jointly semi-orthogonal NMF and it enables effective and accurate integrative analysis of single-cell multiomics data. (ii) Methods based on probabilistic generative models. scACE (Lin et al., 2019) is a modelbased approach to jointly cluster single-cell chromatin accessibility and single-cell gene expression data. It does not rely on training data to connect the two data types and allows for statistical inference of the cluster assignment. MOFAþ (Argelaguet et al., 2020) is based on extensions of factor analysis model and was designed to deal with increasingly large-scale multi-omics datasets. scAMACE (Wangwu et al., 2021) is a model-based approach to the joint analysis of single-cell data on chromatin accessibility, gene expression and methylation, and it develops an efficient expectation-maximization algorithm to perform statistical inference. (iii) Other machine learning-based methods. Seurat (version 3) (Stuart et al., 2019) anchors diverse datasets together with the capability of integrating single-cell measurements not only across scRNA-seq technologies, but also across different data modalities, that is, data types within single cells, for example, scRNA-seq data, scATAC-seq data and DNA methylation data. coupleCoC (Zeng et al., 2020) and coupleCoCþ (Zeng and Lin, 2021) are transfer learning methods based on the information-theoretic co-clustering framework for the integrative analysis of single-cell genomics data. Seurat (version 4) (Hao et al., 2021) introduces the 'weighted nearest neighbor' analysis to learn the relative utility of each genomic feature in each cell, enabling an integrative analysis of multi-omics data.
Two important issues should be taken into account while integrating single-cell multi-omics data. The first issue is how much weight should be assigned to each data modality. The information carried by different types of genomic features are complementary and it is desirable to integrate them. However, the data from different types of genomic features can have different levels of noise. As an example, consider the setting where scRNA-seq and scATAC-seq are profiled on the same cells. scATAC-seq data tend to have higher level of noise and higher degree of sparsity, and it will be intuitive to give smaller weight to scATAC-seq data compared with scRNA-seq data, while integrating the two data modalities. Another issue is how to link data from multiple omics in a way that is biologically meaningful. In the above setting, a subset of features in scATAC-seq data are linked with scRNA-seq data, because promoter accessibility/gene activity score are directly linked with gene expression. Effectively connecting the linked features is expected to be helpful in the integrative analysis of multi-omics data.
In this work, we present scAWMV-an adaptively weighted multi-view learning framework to integrate scRNA-seq data and scATAC-seq data measured from the same cells. Utilizing a unified matrix factorization model, our framework not only automatically assigns a weight to each modality of multi-omics data, but also connects the linked features across data types by adding a constraint (Fig. 1A). Unsupervised learning of this framework will result in biologically meaningful low-rank matrices that represent the transcriptomic and epigenomic profiles. These matrices allow for inferring low-dimensional representations (Fig. 1B), the identification of factor-specific marker genes (Fig. 1C) and the identification of cell types (Fig. 1D). In comparison with the recent multi-omics data (1) reconstruction errors by NMF for the data matrices from scRNA-seq and scATAC-seq, and each factorization is assigned an adaptive weight; (2) the regularization toward a common consensus for all the cell loading matrices; (3) the constraint on the gene loading matrices obtained from the NMF of the linked data, that is, gene expression data matrix in scRNA-seq and gene activity score matrix in scATAC-seq; (4) the penalty term for the adaptive weights. (B) Based on the common latent structure from (A), scAWMV uses Louvain clustering and groups the cells in the same clusters in the heatmap of the common latent structure. (C) scAWMV ranks genes based on the gene loading matrix for scRNA-seq data from (A). For example, genes 1-9 are labeled with the highest loadings. (D) scAWMV assigns cell type labels to cell clusters with known marker genes integration methods in four published datasets, we demonstrate that our framework is efficient in revealing cellular heterogeneity.

The methodology
In this section, we first introduce the frameworks of single-view NMF (Wang and Zhang, 2013) and multi-view NMF (Liu et al., 2013), and then extend them to our framework of adaptively weighted multi-view NMF for single-cell multi-omics data. We treat scATAC-seq data and scRNA-seq data that are measured from the same cells as two views of the single-cell genomic data. We assume that a subset of the features, that is, gene activity score in scATACseq data is linked with the gene expression in scRNA-seq data; and the other features, that is, accessibility of distal peaks in scATACseq data, are not directly linked with the genes in scRNA-seq data. We expect to improve the clustering performance of the cells by (i) integrating these views of data for uncovering the common shared structure; (ii) automatically assigning weights for adjusting for the importance of each view and (iii) adding a constraint to the linked features across data types for considering the biological dependency across different types of genomic features.

Single-view NMF
Let X be a p by n data matrix, representing the data on p features of n samples. A NMF X ¼ WH T gives us a 'soft' clustering of the samples (Duren et al., 2018): the ith column of basis matrix W gives the mean vectors of the ith cluster of samples and the jth row of coefficient matrix H gives the assignment weights of the jth sample to the clusters. The goal of NMF framework (Wang and Zhang, 2013) is to obtain the factorizations by solving the following optimization problem: where jj Á jj F is the Frobenius norm and W ! 0; H ! 0 stand for the constraints that all elements in the matrix are non-negative.

Multi-view NMF
Assume that the data have n views, and let fX ð1Þ ; X ð2Þ ; . . . ; X ðnÞ g denote the data of all the views. Here, for each view X ðvÞ , we have the single-view NMF: X ðvÞ ¼ W ðvÞ ðH ðvÞ Þ T . For different views, we have the same number of samples but allow for different number of features. Thus, H ðvÞ s are of the same shape but W ðvÞ s may differ along row dimension across multiple views. To integrate information from multiple views in clustering, Liu et al. (2013) developed multi-view NMF framework to cluster multiple views simultaneously to uncover the common latent structure shared by multiple views. A sample in different views would be assigned to the same cluster with high likelihood, thus the coefficient matrices H ð1Þ ; H ð2Þ ; . . . ; H ðnÞ learnt from these views would be required to be softly regularized toward a common consensus H Ã . The goal of multi-view NMF framework (Liu et al., 2013) is to obtain the factorizations in multiple views by solving the following optimization problem: argmin ðjjX ðvÞ À W ðvÞ ðH ðvÞ Þ T jj 2 F þ k ðvÞ jjH ðvÞ Q ðvÞ À H Ã jj 2 F Þ; (2) represents the element of the ith row and the kth column in the matrix W ðvÞ and p ðvÞ denotes the number of features for the vth view. Here, Q ðvÞ is a normalization term for making H ðvÞ in multiple views comparable at the same scale (Liu et al., 2013). The hyperparameter k ðvÞ gives the relative weight between the standard NMF reconstruction error jjX ðvÞ À W ðvÞ ðH ðvÞ Þ T jj 2 F and the disagreement between the coefficient matrix and the consensus matrix jjH ðvÞ Q ðvÞ À H Ã jj 2 F for the vth view.

The proposed framework
We now extend the multi-view NMF (Liu et al., 2013) to model singlecell multi-omics data. The gene activity score in scATAC-seq data is linked with gene expression in scRNA-seq data, and the accessibility of distal peaks in scATAC-seq data is not directly linked with the genes in scRNA-seq data. We use the index symbols ðv; 1Þ and ðv; 2Þ to represent the linked part and the unlinked part of data in the vth view, respectively. We write the vth view of data as X ðvÞ ¼ X ðv;1Þ X ðv;2Þ p ðvÞ Ân , where p ðvÞ ¼ p ðv;1Þ þ p ðv;2Þ , representing the sum of the number of features in the linked matrix X ðv;1Þ and that in the unlinked matrix X ðv;2Þ . Here, we assume that ½X ð1;1Þ p ð1;1Þ Ân (i.e. gene expression) is directly linked with ½X ð2;1Þ p ð2;1Þ Ân (i.e. gene activity score), and ½X ð1;2Þ p ð1;2Þ Ân (i.e. gene expression) is not directly linked with ½X ð2;2Þ p ð2;2Þ Ân (i.e. accessibility of distal peaks). We consider these submatrices X ð1;1Þ ; X ð1;2Þ ; X ð2;1Þ and X ð2;2Þ as the four views of single-cell genomic datasets. We propose the following optimization problem (illustrated in Fig. 1A): argmin i;k represents the element of the ith row and the kth column in the matrix W ðv;lÞ . Note that W ð1;1Þ and W ð2;1Þ have the same number of rows, that is, p ð1;1Þ ¼ p ð2;1Þ , under the assumption that X ð1;1Þ is directly linked to X ð2;1Þ , and the number of columns of both W ð1;1Þ and W ð2;1Þ is equal to the number of factors K.
The naive integration of different views by multi-view NMF does not consider the difference in the relative importance across different views. Therefore, we assign each view X ðv;lÞ with an adaptive weight x ðv;lÞ in our framework (Equation 3). c P 2 v¼1 P 2 l¼1 x ðv;lÞ lnx ðv;lÞ represents the penalty term for these weights. c is the hyperparameter which controls the distribution of the adaptive weight: when c is smaller, the view that has less reconstruction error NMF (which suggests that the factors carry more effective information) will be given higher weight. More details for the intuition of c and the weights are provided in Section 2.4. The features in the linked data X ð1;1Þ and X ð2;1Þ are connected, because gene expression is positively associated with gene activity. We further encourage the difference between the basis matrices W ð1;1Þ and W ð2;1Þ to be small. It accounts for the dependence between gene expression in single-cell transcriptomic data and gene activity score in single-cell chromatin accessibility data. The hyperparameter b controls the strength of the constraint jjW ð1;1Þ À W ð2;1Þ jj 2 F . The rules for choosing the hyperparameters k ðv;lÞ ðv; l ¼ 1; 2Þ, b, c and the number of factors K will be discussed in Section 2.5. We call this adaptively weighted multi-view learning framework scAWMV for short.

Optimization algorithm
We optimize the objective function in Equation (3)   where ðÁÞ i;k represents the element of the ith row and the kth column in the matrix ðÁÞ. Both x ðv;lÞ and H Ã have exact solutions in scAWMV each iteration. The intuition for x ðv;lÞ and c becomes clear by looking at the update for x ðv;lÞ : first, the views that have smaller reconstruction error in NMF (i.e. jjX ðv;lÞ À W ðv;lÞ ðH ðv;lÞ Þ T jj 2 F ) will be given higher weight; second, the hyperparameter c acts as a bandwidth parameter. We stop the iterations when the difference of the loss function (Equation 3) is less than 10 À4 between two adjacent iterations. The details for the derivation of these updating rules are given in the Supplementary Materials.

Selection of hyperparameters
We firstly obtain the initial solutionsW using an approach similar to that in Liu et al. (2013), and use them as the initialization for our optimization problem (Equation 3). We initialize the adaptive weightsx ðv;lÞ as 1 4 for v; l ¼ 1; 2. We fix the value of the hyperparameter k ðv;lÞ as 0.01 for v; l ¼ 1; 2, as suggested in multi-view NMF by Liu et al. (2013). We choose the hyperparameter The number of factors K can be determined using an approach similar to Brunet et al. (2004). In this work, we set K ¼ 20 in all the real datasets. We will discuss in great detail the sensitivity of the choices of the hyperparameters in Section 3.4.

Evaluation of the clustering results
We first compute the cluster of cells by Louvain method (Blondel et al., 2008) once we obtain the consensus matrix H Ã , and then evaluate the clustering performance by three criteria: normalized mutual information (NMI), adjusted Rand index (ARI) (Christopher et al., 2008) and the Residual Average Gini Index (RAGI) score (Chen et al., 2019). NMI and ARI require the known ground-truth labels of cells, and we use the cell type labels of gene expression data provided in the real datasets as the ground-truth labels. Assume that G is the known cell type labels and P is the predicted clustering assignments, we then calculate NMI as IðG; PÞ ffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffi ffi EðGÞ Á EðPÞ p ; where IðG; PÞ represents the mutual information over G and P, and EðÁÞ represents the entropy. Assume that n is the total number of single cells, n Q;i is the number of cells assigned to the ith cluster in Q, n G;j is the number of cells belonging to the jth cell type in G and n i;j is the number of overlapping cells between the ith cluster in Q and the jth cell type in G. As a corrected-for-chance version of the Rand index, ARI is calculated as P ij n i;j 2 À P i n Q;i 2 P j n G;j 2 = n 2 1 2 P i n Q;i 2 þ P j n G;j 2 À P i n Q;i 2 P j n G;j 2 = n 2 : Higher values of NMI and ARI indicate better clustering performance. RAGI score calculates the difference between the variability of marker gene expression across cell clusters and the variability of housekeeping gene expression across cell clusters. We note that no cell type labels are required when computing RAGI score on scRNA-seq data. First, we manually find the marker genes using the CellMarker database developed by Zhang et al. (2019) and find the housekeeping genes using the HRT Atlas v1.0 database developed by Hounkpe et al. (2021). Second, we compute Gini index (Gini, 1997) for each marker gene and each housekeeping gene. The Gini index measures how imbalanced the expression of a gene is across cell clusters. Third, based on the sets of Gini index values for marker genes and housekeeping genes, we calculate the difference between the mean Gini index for marker genes and the mean Gini index for housekeeping genes, that is, the RAGI score. Higher RAGI score represents better separation of the cell clusters.

Data preprocessing and feature selection
In order to get the linked data, we first compute the gene activity score using the gene scoring approach by Chen et al. (2019): the distance-weighted sum of reads (peak values in scATAC-seq data) within or near the region gives the accessibility at each transcription start site. Second, we extract the set of genes that have both gene expression and gene activity score, which can be considered as the linked features. Third, we use the scRNA-seq data to choose 2000 most highly variable genes from this set by R toolkit Seurat (Butler et al., 2018;Stuart et al., 2019). We then obtain the linked data ½X ð1;1Þ 2;000Ân and ½X ð2;1Þ 2;000Ân . We also use the R toolkit Seurat to choose 2000 most highly variable genes from the set of genes that are not included in the linked data, and we have the unlinked scRNA-seq data ½X ð1;2Þ 2;000Ân . We choose 5000 unlinked features in scATAC-seq data, corresponding to the top 5000 largest summation of peak values over all cells, and we have the unlinked scATAC-seq data ½X ð2;2Þ 5;000Ân . We take log transformation for scRNA-seq data to alleviate the effect of extreme values in the data matrices. Before implementing our algorithm, we normalized each of these data matrices, including ½X ð1;1Þ 2;000Ân ; ½X ð2;1Þ 2;000Ân ; ½X ð1;2Þ 2;000Ân and ½X ð2;2Þ 5;000Ân , to make different views of data comparable by dividing the sum of all the entries within that data matrix.

Results
In this section, we evaluated our methodology in four real datasets of single-cell multiome ATAC and gene expression, including peripheral blood mononuclear cell (PBMC) from healthy donorsgranulocytes removed through cell sorting (one dataset has 2711 cells and another one has 11 898 cells), frozen healthy human brain tissue (3233 cells) and fresh frozen lymph node with B-cell lymphoma (14 566 sorted nuclei). All of these datasets are processed by Cell Ranger ARC 2.0.0 and are available at https://www.10xgenom ics.com/. We compared our method scAWMV with scAI (Jin et al., 2020), MOFAþ (Argelaguet et al., 2020), Seurat V4 (Hao et al., 2021) and multi-NMF (Liu et al., 2013). To check whether the constraint on the linked data can improve the clustering performance, we compared a simplified version of scAWMV, where the objective function has no constraint on the linked data (i.e. bjjW ð1;1Þ À W ð2;1Þ jj 2 F ) compared with that in scAWMV, and we denote it as scAWMV-no-link. To check whether the adaptive weights on different views can improve the clustering performance, we compared another simplified version of scAWMV, where the objective function has no adaptive weights compared with that in scAWMV, and we denote it as scAWMV-no-weights. To check whether the linked part in the datasets is helpful in clustering the cells, we also considered the case where the gene activity score of scATAC-seq data (i.e. X ð2;1Þ ) is removed from the input, and we denote it as scAWMV-no-X ð2;1Þ . Except for Seurat V4 (Hao et al., 2021) which can produce clusters of cells by itself, we implemented Louvain clustering on the low-dimensional representation of cells after dimension reduction by the other baseline methods: the cell loading matrix H given by scAI (Jin et al., 2020), the latent factor Z given by MOFAþ (Argelaguet et al., 2020) and the consensus V Ã given by multi-NMF (Liu et al., 2013). (Here, the notations H, Z and V Ã are consistent with that in their original publications.) We used the NMI, ARI and RAGI to evaluate the clustering results (Table 1). When computing the cluster memberships of cells in four real datasets by all methods except Seurat V4, we set the numbers of clusters the same as that given by the secondary analysis outputs from https://www.10xgenomics.com/.

Example 1: application to PBMCs from healthy donors
In the first example, we studied the paired single-cell multiome ATAC and gene expression of cryopreserved human PBMCs from two healthy female donors (Examples 1A and 1B). The numbers of cells in Examples 1A and 1B are 2711 and 11 898, respectively. We set the number of clusters in Example 1A as 9 and set the number of clusters in Example 1B as 17 when implementing Louvain method for clustering. These numbers of clusters are given by the analysis of gene expression data on the 10Â Genomics website. Examples 1A and 1B in Table 1 show the results of clustering the cells. Our method performs the best in NMI, ARI and RAGI among the eight methods in Example 1A, and performs the best in NMI and RAGI among the eight methods in Example 1B. In both Examples 1A and 1B, the value of RAGI given by Seurat V4 ranks the joint first. On the one hand, both scAWMV-no-link and scAWMV-no-weight have slightly worse clustering results compared with scAWMV. This suggests that the removal of the constraint on the linked data or adaptive weights in the objective function of scAWMV has negative impacts on the clustering performance. On the other hand, when we excluded the data matrix X ð2;1Þ from the input datasets of our method, the clustering performance worsens. It suggests that incorporating the gene activity score in scATAC-seq data and the connection between gene activity score and gene expression in scRNA-seq data are helpful in improving the result of clustering the cells.
We compared the heatmaps of the common latent structure ðH Ã Þ T given by scAWMV and the latent structure H given by scAI in Figure 2A and B for Example 1A, and in Figure 3A and B for Example 1B, respectively. They show that the consensus ðH Ã Þ T given by scAWMV can detect more clear patterns than that given by scAI. In Example s1A and 1B, we assigned the cell type labels for all the cell clusters based on marker genes (Supplementary Figs S1 and S2). We then assigned cell types to the factors in the common factor loading matrix ðH Ã Þ T by the following rule: we selected the top 200 entries in each row of ðH Ã Þ T , calculated the proportions of the celltype labels corresponding to these selected entries and chose the cell type with the largest proportion as the cell type for the factors. Factors 1 and 2 in Example 1A correspond to natural killer cells (proportion ¼ 96%) and B cells (proportion ¼ 97%), respectively. Factors 1 and 2 in Example 1B correspond to B cells (proportion ¼ 91%) and natural killer cells (proportion ¼ 89%), respectively. The marker genes for natural killer cells and B cells (Zhang et al., 2019) also tend to have high scores in the first two factors in W ð1;1Þ and W ð2;1Þ (Fig. 2C and D for Example 1A and Fig. 3C and D for Example 1B), which is in agreement with the cell-type annotation for ðH Ã Þ T , and demonstrates that the entries in the feature loading matrices provide biological insight on the factors.
We also studied the gene ontology (GO) enrichment analysis for the feature loading matrices W ð1;1Þ and W ð2;1Þ , which correspond to gene expression data and gene activity score in chromatin accessibility data, respectively. We selected the top 200 linked genes with large values in each column of W ð1;1Þ and W ð2;1Þ , and utilized Metascape (Zhou et al., 2019) to implement GO enrichment analysis. The enriched biological processes and pathways for gene expression data and gene activity score tend to be consistent and they agree with the biological function of the underlying cell types that the factors represent (Supplementary Tables S1 and S2). Factor 1 in Example 1A corresponds to natural killer cells, which is a type of cytotoxic lymphocyte critical to the innate immune system that belong to the rapidly expanding family of innate lymphoid cells and represent 5-20% of all circulating lymphocytes in humans (Arachchige and Shavinda, 2021). As demonstrated in Supplementary Table S1, the first columns in W ð1;1Þ and W ð2;1Þ are both enriched for 'cell activation' (log(q-value) ¼ -14.57 for W ð1;1Þ and log(q-value) ¼ -10.75 for W ð2;1Þ ), 'regulation of cell activation' (log(q-value) ¼ -10.84 for W ð1;1Þ and log(q-value) ¼ -11.31 for W ð2;1Þ ) and 'inflammatory response' (log(q-value) ¼ -6.24 for W ð1;1Þ and log(q-value) ¼ -8.29 for W ð2;1Þ ). Factor 2 in Example 1A corresponds to B cells, which functions in the humoral Notes: NMI and ARI were computed by the cell type labels provided by the analysis of gene expression data on the 10Â Genomics website. RAGI was computed using marker genes and housekeeping genes. The bold numbers represent the best clustering results. (D) Ranking of known cell-type-specific marker genes in factors 1 and 2 from W ð2;1Þ in scAWMV. Note that the marker genes for B cells and natural killer cells in PBMCs are collected from CellMarker database (Zhang et al., 2019) scAWMV immunity component of the adaptive immune system (Murphy, 2012). The second columns in W ð1;1Þ and W ð2;1Þ are both enriched for 'regulation of immune effector process' (log(q-value) ¼ -5.64 for W ð1;1Þ and log(q-value) ¼ -9.07 for W ð2;1Þ ) and 'positive regulation of cytokine production' (log(q-value) ¼ -4.73 for W ð1;1Þ and log(q-value) ¼ -5.40 for W ð2;1Þ ). We observed the same trend in the enrichment analysis for the feature loading matrices W ð1;1Þ and W ð2;1Þ in Example 1B (Supplementary Table S2). These results suggest that the feature loading matrices W ð1;1Þ and W ð2;1Þ boost GO enrichment analysis and provide rich information on the biological interpretation of the factors. Also, the consistent trends in W ð1;1Þ and W ð2;1Þ indicate the effect of the constraint jjW ð1;1Þ À W ð2;1Þ jj 2 F in scAWMV (Equation 3).
3.2 Example 2: application to frozen healthy human brain tissue (3k) In the second example, we explored the paired single-cell multiome ATAC and gene expression of flash frozen healthy human brain tissue (cerebellum). The number of cells is 3233. We set the number of clusters as 8, given by the analysis of gene expression data on the 10Â Genomics website. The clustering results in Example 2 from Table 1 show that scAWMV performs the best in NMI, scAWMVno-X ð2;1Þ performs the best in ARI and Seurat V4 performs the best in RAGI among the eight methods. The clustering performance by scAWMV is slightly better than that by scAWMV-no-link and scAWMV-no-weight. In Figure 4A and B, we compared the heatmaps of the common latent structure ðH Ã Þ T given by scAWMV and the latent structure H given by scAI, respectively. The patterns identified by ðH Ã Þ T in scAWMV are comparable and slightly clearer than that identified by H in scAI. The feature loading matrices W ð1;1Þ and W ð2;1Þ given by scAWMV have consistent trends and tend to be enriched for marker genes of the corresponding cell types ( Fig. 4C and D). In Figure 4C and D, the enrichment for the modality of chromatin accessibility, W ð2;1Þ , tends to be weaker compared with the modality of gene expression, W ð1;1Þ , which is likely due to the higher level of noise in chromatin accessibility data.
3.3 Example 3: application to fresh frozen lymph node with B-cell lymphoma (14k) In the last example, we studied the paired single-cell multiome ATAC and gene expression of flash frozen intra-abdominal lymph node tumor from a patient diagnosed with diffuse small lymphocytic lymphoma. The number of cells is 14 566. We set the number of clusters as 18, given by the analysis of gene expression data on the 10Â Genomics website. Example 3 in Table 1 shows the results of clustering the cells. Our method scAWMV performs the best in NMI, ARI and RAGI among the eight methods, and the value of RAGI given by Seurat V4 ranks the joint first. Compared with scAWMV, both scAWMV-no-link and scAWMV-no-weight have slightly worse clustering results. The result by scAWMV-no-X ð2;1Þ shows that the clustering performance of scAWMV becomes worse when we excluded the data matrix X ð2;1Þ from the input datasets. We compared the heatmaps of the common latent structure ðH Ã Þ T given by scAWMV and H given by scAI in Figure 5A and B. They show that it is hard to detect very clear patterns from the heatmaps of ðH Ã Þ T and H, likely due to the high level of noise in this dataset. In Figure 5C and D, we chose factors 1 and 2 from coefficient matrices W ð1;1Þ and W ð2;1Þ given by scAWMV and demonstrate that known cell type markers tend to have higher rankings in the feature loading matrices. In Figure 5C and D, the enrichment for the modality of chromatin accessibility, W ð2;1Þ , tends to be weaker compared with the modality of gene expression, W ð1;1Þ , which is likely due to the higher level of noise in chromatin accessibility data.

Discussion
First, we note that the results from ARI, NMI and RAGI are not be all consistent with each other (Table 1), since the three evaluation criteria depend on different theories and use different inputs: (i) NMI, ARI and RAGI are based on Shannon information, paircounting and Gini index, respectively, and (ii) both NMI and ARI need the cell type labels as input while RAGI does not. Instead, RAGI needs the list of marker genes and housekeeping genes as input.
Second, in order to study the baseline clustering performance when only one data type, that is, scRNA-seq or scATAC-seq data, is used in all these examples, we compared NMF-only-RNA and NMF-only-ATAC (Supplementary Table S3), which are NMF methods using only scRNA-seq or scATAC-seq data as input. We also summarized the adaptive weights obtained by scAWMV for all views (Supplementary Table S4). The results in Supplementary  Table S3 show that the clustering performance of both NMF-only-RNA and NMF-only-ATAC is worse than scAWMV in all the above examples. It suggests that the integration of two data types is needed and adaptive weights need to be adjusted. Simultaneously analyzing the paired single-cell multiome ATAC and gene expression for frozen healthy human brain tissue in Example 2. (A) Heatmap of the common latent structure ðH Ã Þ T given by scAWMV. We grouped the cells in the same clusters based on Louvain clustering. (B) Heatmap of the latent structure H given by scAI. We grouped the cells in the same clusters based on Louvain clustering. (C) Ranking of known cell-type-specific marker genes in factors 1 and 2 from W ð1;1Þ in scAWMV. (D) Ranking of known cell-type-specific marker genes in factors 1 and 2 from W ð2;1Þ in scAWMV. Note that the marker genes for excitatory neuron and inhibitory neuron in human brain are collected from CellMarker database (Zhang et al., 2019) Fig. 5. Simultaneously analyzing the paired single-cell multiome ATAC and gene expression for fresh frozen lymph node with B-cell lymphoma in Example 3. (A) Heatmap of the common latent structure ðH Ã Þ T given by scAWMV. We grouped the cells in the same clusters based on Louvain clustering. (B) Heatmap of the latent structure H given by scAI. We grouped the cells in the same clusters based on Louvain clustering. (C) Ranking of known cell-type-specific marker genes in factors 1 and 2 from W ð1;1Þ in scAWMV. (D) Ranking of known cell-type-specific marker genes in factors 1 and 2 from W ð2;1Þ in scAWMV. Note that the marker genes for B cells and lymphatic endothelial cells in lymph node are collected from CellMarker database (Zhang et al., 2019) scAWMV